EXCAVATOR: a computer program for efficiently mining gene expression data.
Abstract
Microarrays generate massive amounts of gene expression data for functional studies of genes, and clustering these data is a useful tool for studying the functional relationships among genes in a biological process. We have developed a computer package, EXCAVATOR, for clustering gene expression profiles based on our new framework for representing gene expression data as a minimum spanning tree. EXCAVATOR uses a number of rigorous and efficient clustering algorithms. The program has a number of unique features, including capabilities for: (i) data-constrained clustering; (ii) identification of genes with expression profiles similar to pre-specified seed genes; (iii) cluster identification from a noisy background; and (iv) computational comparison between different clustering results of the same data set. EXCAVATOR can be run from a Unix/Linux/DOS shell, from a Java interface or from a Web server. The clustering results can be visualized as colored figures and two-dimensional plots. Moreover, EXCAVATOR provides a wide range of options for data formats, distance measures, objective functions, clustering algorithms and methods for choosing the number of clusters. The effectiveness of EXCAVATOR has been demonstrated on several experimental data sets. Its performance compares favorably with the popular K-means clustering method in terms of clustering quality and computing time.
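To make the minimum spanning tree idea concrete, the following is a minimal sketch of one simple way to cluster expression profiles with such a tree: build the tree over all genes, then remove the k-1 longest edges so that k connected groups remain. It is an illustration only and is not taken from EXCAVATOR. The function names (`euclidean`, `minimum_spanning_tree`, `mst_clusters`), the choice of Euclidean distance, and the longest-edge cutting rule are all assumptions made for this example; the package itself is described as offering several distance measures, objective functions and algorithms.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(profiles, dist=euclidean):
    """Prim's algorithm on the complete graph over all profiles.

    Returns a list of (weight, i, j) edges; n profiles give n - 1 edges.
    """
    n = len(profiles)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n       # tree-side endpoint of that cheapest edge
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the cheapest connection for every remaining node.
        for v in range(n):
            if not in_tree[v]:
                d = dist(profiles[u], profiles[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(profiles, k, dist=euclidean):
    """Split the profiles into k groups by dropping the k - 1 longest tree edges.

    Hypothetical helper for illustration; returns one cluster label per profile.
    """
    n = len(profiles)
    edges = sorted(minimum_spanning_tree(profiles, dist))
    kept = edges[: max(n - k, 0)]  # keep the n - k shortest of the n - 1 edges

    # Union-find over the kept edges to recover connected components.
    root = list(range(n))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


if __name__ == "__main__":
    # Four made-up profiles over three conditions: two low and two high.
    demo = [[0.0, 0.1, 0.2], [0.1, 0.1, 0.3], [5.0, 5.2, 4.9], [5.1, 5.0, 5.0]]
    print(mst_clusters(demo, k=2))
```

On the made-up profiles in the demo, the two low-expression genes and the two high-expression genes end up in separate groups, because the single long edge joining them is the one that is cut.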
Journal: Nucleic Acids Research
Volume 31, Issue 19
Pages: -
Publication date: 2003